Results 1 - 20 of 165
1.
Nucleic Acids Res ; 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38634808

ABSTRACT

Evaluating pharmacokinetic properties of small molecules is considered a key feature in most drug development and high-throughput screening processes. Generally, pharmacokinetics, which represent the fate of drugs in the human body, are described from four perspectives: absorption, distribution, metabolism and excretion, all of which are closely related to a fifth perspective, toxicity (ADMET). Since obtaining ADMET data from in vitro, in vivo or pre-clinical stages is time-consuming and expensive, many efforts have been made to predict ADMET properties via computational approaches. However, the majority of available methods are limited in their ability to provide pharmacokinetics and toxicity for diverse targets, ensure good overall accuracy, and offer ease of use, interpretability and extensibility for further optimizations. Here, we introduce Deep-PK, a deep learning-based pharmacokinetic and toxicity prediction, analysis and optimization platform. We applied graph neural networks and graph-based signatures as a graph-level feature to yield the best predictive performance across 73 endpoints, including 64 ADMET and 9 general properties. With these powerful models, Deep-PK supports molecular optimization and interpretation, aiding users in optimizing and understanding pharmacokinetics and toxicity for given input molecules. Deep-PK is freely available at https://biosig.lab.uq.edu.au/deeppk/.

2.
Cardiovasc Diabetol ; 23(1): 91, 2024 03 06.
Article in English | MEDLINE | ID: mdl-38448993

ABSTRACT

BACKGROUND: Recent guidelines propose N-terminal pro-B-type natriuretic peptide (NT-proBNP) for recognition of asymptomatic left ventricular (LV) dysfunction (Stage B Heart Failure, SBHF) in type 2 diabetes mellitus (T2DM). Wavelet transform-based signal processing transforms electrocardiogram (ECG) waveforms into an energy distribution waveform (ew)ECG, providing frequency and energy features that machine learning can use as additional inputs to improve the identification of SBHF. Accordingly, we sought to determine whether a machine learning model based on ewECG features was superior to NT-proBNP, as well as to a conventional screening tool, the Atherosclerosis Risk in Communities (ARIC) HF risk score, in SBHF screening among patients with T2DM. METHODS: Participants in two clinical trials of SBHF (defined as diastolic dysfunction [DD], reduced global longitudinal strain [GLS ≤ 18%] or LV hypertrophy [LVH]) in T2DM underwent 12-lead ECG with additional ewECG features and echocardiography. Supervised machine learning was adopted to identify the optimal combination of ewECG-extracted features for SBHF screening in 178 participants in one trial and tested in 97 participants in the other trial. The accuracy of the ewECG model in SBHF screening was compared with NT-proBNP and ARIC HF. RESULTS: SBHF was identified in 128 (72%) participants in the training dataset (median 72 years, 41% female) and 64 (66%) in the validation dataset (median 70 years, 43% female). Fifteen ewECG features showed an area under the curve (AUC) of 0.81 (95% CI 0.787-0.794) in identifying SBHF, significantly better than both NT-proBNP (AUC 0.56, 95% CI 0.44-0.68, p < 0.001) and ARIC HF (AUC 0.67, 95% CI 0.56-0.79, p = 0.002). ewECG features also led to robust models screening for DD (AUC 0.74, 95% CI 0.73-0.74), reduced GLS (AUC 0.76, 95% CI 0.73-0.74) and LVH (AUC 0.90, 95% CI 0.88-0.89). CONCLUSIONS: Machine learning-based modelling using additional ewECG-extracted features is superior to NT-proBNP and ARIC HF in SBHF screening among patients with T2DM, providing an alternative HF screening strategy for asymptomatic patients and potentially acting as a guidance tool to determine those who require an echocardiogram to confirm the diagnosis. Trial registration: LEAVE-DM, ACTRN 12619001393145 and Vic-ELF, ACTRN 12617000116325.


Subjects
Atherosclerosis , Type 2 Diabetes Mellitus , Humans , Female , Male , Type 2 Diabetes Mellitus/complications , Type 2 Diabetes Mellitus/diagnosis , Electrocardiography , Echocardiography , Risk Factors , Left Ventricular Hypertrophy
3.
J Clin Invest ; 134(4)2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38357931

ABSTRACT

Nicotinamide adenine dinucleotide (NAD) is essential for embryonic development. To date, biallelic loss-of-function variants in 3 genes encoding nonredundant enzymes of the NAD de novo synthesis pathway - KYNU, HAAO, and NADSYN1 - have been identified in humans with congenital malformations defined as congenital NAD deficiency disorder (CNDD). Here, we identified 13 further individuals with biallelic NADSYN1 variants predicted to be damaging, and phenotypes ranging from multiple severe malformations to the complete absence of malformation. Enzymatic assessment of variant deleteriousness in vitro revealed protein domain-specific perturbation, complemented by protein structure modeling in silico. We reproduced NADSYN1-dependent CNDD in mice and assessed various maternal NAD precursor supplementation strategies to prevent adverse pregnancy outcomes. While for Nadsyn1+/- mothers, any B3 vitamer was suitable to raise NAD, preventing embryo loss and malformation, Nadsyn1-/- mothers required supplementation with amidated NAD precursors (nicotinamide or nicotinamide mononucleotide) bypassing their metabolic block. The circulatory NAD metabolome in mice and humans before and after NAD precursor supplementation revealed a consistent metabolic signature with utility for patient identification. Our data collectively improve clinical diagnostics of NADSYN1-dependent CNDD, provide guidance for the therapeutic prevention of CNDD, and suggest an ongoing need to maintain NAD levels via amidated NAD precursor supplementation after birth.


Subjects
Carbon-Nitrogen Ligases with Glutamine as Amide-N-Donor , NAD , Female , Pregnancy , Humans , Mice , Animals , NAD/metabolism , Niacinamide , Phenotype , Metabolome , Carbon-Nitrogen Ligases with Glutamine as Amide-N-Donor/metabolism
4.
Article in English | MEDLINE | ID: mdl-38180643

ABSTRACT

Glycoside hydrolases (GHs) are a diverse group of enzymes that catalyze the hydrolysis of glycosidic bonds. The Carbohydrate-Active enZymes (CAZy) classification organizes GHs into families based on sequence data and function, with fewer than 1% of the predicted proteins characterized biochemically. Consideration of genomic context can provide clues to infer possible enzyme activities for proteins of unknown function. We used the MultiGeneBLAST tool to discover a gene cluster in Marinovum sp., a member of the marine Roseobacter clade, that encodes homologues of enzymes belonging to the sulfoquinovose monooxygenase pathway for sulfosugar catabolism. This cluster lacks a gene encoding a classical family GH31 sulfoquinovosidase candidate, but instead includes an uncharacterized family GH13 protein (MsGH13) that we hypothesized could be a non-classical sulfoquinovosidase. Surprisingly, recombinant MsGH13 lacks sulfoquinovosidase activity and is a broad-spectrum α-glucosidase that is active on a diverse array of α-linked disaccharides, including maltose, sucrose, nigerose, trehalose, isomaltose, and kojibiose. Using AlphaFold, a 3D model for the MsGH13 enzyme was constructed that predicted its active site shared close similarity with an α-glucosidase from Halomonas sp. H11 of the same GH13 subfamily that shows narrower substrate specificity.

5.
Curr Opin Pharmacol ; 74: 102427, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38219398

ABSTRACT

This article investigates the role of recent advances in Artificial Intelligence (AI) in revolutionising the study of G protein-coupled receptors (GPCRs). AI has been applied to many areas of GPCR research, including the application of machine learning (ML) in GPCR classification, prediction of GPCR activation levels, modelling GPCR 3D structures and interactions, understanding G-protein selectivity, aiding elucidation of GPCR structures, and drug design. Despite progress, challenges in predicting GPCR structures and addressing the complex nature of GPCRs remain, providing avenues for future research and development.


Subjects
Artificial Intelligence , G-Protein-Coupled Receptors , Humans , G-Protein-Coupled Receptors/chemistry , Machine Learning
6.
Hum Genet ; 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38227011

ABSTRACT

Missense mutations are known contributors to diverse genetic disorders, due to their subtle, single amino acid changes imparted on the resultant protein. Because of this, understanding the impact of these mutations on protein stability and function is crucial for unravelling disease mechanisms and developing targeted therapies. The Critical Assessment of Genome Interpretation (CAGI) provides a valuable platform for benchmarking state-of-the-art computational methods in predicting the impact of disease-related mutations on protein thermodynamics. Here we report the performance of our comprehensive platform of structure-based computational approaches to evaluate mutations impacting protein structure and function on 3 challenges from CAGI6: Calmodulin, MAPK1 and MAPK3. Our stability predictors have achieved correlations of up to 0.74 and AUCs of 1 when predicting changes in ΔΔG for MAPK1 and MAPK3, respectively, and AUC of up to 0.75 in the Calmodulin challenge. Overall, our study highlights the importance of structure-based approaches in understanding the effects of missense mutations on protein thermodynamics. The results obtained from the CAGI6 challenges contribute to the ongoing efforts to enhance our understanding of disease mechanisms and facilitate the development of personalised medicine approaches.

7.
Hum Mol Genet ; 33(3): 224-232, 2024 Jan 20.
Article in English | MEDLINE | ID: mdl-37883464

ABSTRACT

BACKGROUND: Mutations within the Von Hippel-Lindau (VHL) tumor suppressor gene are known to cause VHL disease, which is characterized by the formation of cysts and tumors in multiple organs of the body, particularly clear cell renal cell carcinoma (ccRCC). A major challenge in clinical practice is determining tumor risk from a given mutation in the VHL gene. Previous efforts have been hindered by limited available clinical data and technological constraints. METHODS: To overcome this, we initially manually curated the largest set of clinically validated VHL mutations to date, enabling a robust assessment of existing predictive tools on an independent test set. Additionally, we comprehensively characterized the effects of mutations within VHL using in silico biophysical tools describing changes in protein stability, dynamics and affinity to binding partners to provide insights into the structure-phenotype relationship. These descriptive properties were used as molecular features for the construction of a machine learning model, designed to predict the risk of ccRCC development as a result of a VHL missense mutation. RESULTS: Analysis of our model showed an accuracy of 0.81 in the identification of ccRCC-causing missense mutations, and a Matthews correlation coefficient of 0.44 on a non-redundant blind test, a significant improvement in comparison to the previously available approaches. CONCLUSION: This work highlights the power of using protein 3D structure to fully explore the range of molecular and functional consequences of genomic variants. We believe this optimized model will better enable its clinical implementation and assist in guiding patient risk stratification and management.
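
The abstract above reports both an accuracy and a Matthews correlation coefficient (MCC). As a minimal sketch of how the two metrics relate, the following computes both from a binary confusion matrix. The labels are invented toy data, not results from the study, and the function names are illustrative assumptions rather than part of any published tool.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical, imbalanced toy data: 8 positives, 2 negatives.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 8 + [1, 0]

counts = confusion_counts(y_true, y_pred)
print(f"accuracy = {accuracy(*counts):.2f}, MCC = {mcc(*counts):.2f}")
```

On this toy set the accuracy is 0.90 while the MCC is about 0.67, which illustrates why an MCC can sit well below the accuracy when classes are imbalanced.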


Subjects
Machine Learning , Missense Mutation , von Hippel-Lindau Disease , Humans , Renal Cell Carcinoma/genetics , Renal Cell Carcinoma/metabolism , Kidney Neoplasms/metabolism , Missense Mutation/genetics , von Hippel-Lindau Disease/genetics , von Hippel-Lindau Disease/pathology , Von Hippel-Lindau Tumor Suppressor Protein/genetics , Von Hippel-Lindau Tumor Suppressor Protein/chemistry , Von Hippel-Lindau Tumor Suppressor Protein/metabolism
8.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38018912

ABSTRACT

Dysfunctions caused by missense mutations in the tumour suppressor p53 have been extensively shown to be a leading driver of many cancers. Unfortunately, it is time-consuming and labour-intensive to experimentally elucidate the effects of all possible missense variants. Recent works presented a comprehensive dataset and machine learning model to predict the functional outcome of mutations in p53. Despite the well-established dataset and precise predictions, this tool was trained on a complicated model with limited predictions on p53 mutations. In this work, we first used computational biophysical tools to investigate the functional consequences of missense mutations in p53, informing a bias of deleterious mutations with destabilizing effects. Combining these insights with experimental assays, we present two interpretable machine learning models leveraging both experimental assays and in silico biophysical measurements to accurately predict the functional consequences on p53 and validate their robustness on clinical data. Our final model based on nine features obtained comparable predictive performance with the state-of-the-art p53 specific method and outperformed other generalized, widely used predictors. Interpreting our models revealed that information on residue p53 activity, polar atom distances and changes in p53 stability were instrumental in the decisions, consistent with a bias of the properties of deleterious mutations. Our predictions have been computed for all possible missense mutations in p53, offering clinical diagnostic utility, which is crucial for patient monitoring and the development of personalized cancer treatment.


Subjects
Missense Mutation , Neoplasms , Humans , Tumor Suppressor Protein p53/genetics , Mutation , Neoplasms/genetics , Machine Learning
9.
Proteins ; 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37870486

ABSTRACT

Proteins are molecular machinery that participate in virtually all essential biological functions within the cell, which are tightly related to their 3D structure. The importance of understanding the protein structure-function relationship is highlighted by the exponential growth of experimental structures, which has been greatly expanded by recent breakthroughs in protein structure prediction, most notably RoseTTAFold and AlphaFold2. These advances have prompted the development of several computational approaches that leverage these data sources to explore potential biological interactions. However, most methods are generally limited to analysis of single types of interactions, such as protein-protein or protein-ligand interactions, and their complexity limits the usability to expert users. Here we report CSM-Potential2, a deep learning platform for the analysis of binding interfaces on protein structures. In addition to prediction of protein-protein interaction binding sites and classification of biological ligands, our new platform incorporates prediction of interactions with nucleic acids at the residue level and allows for ligand transplantation based on sequence and structure similarity to experimentally determined structures. We anticipate our platform to be a valuable resource that provides easy access to a range of state-of-the-art methods to expert and non-expert users for the study of biological interactions. Our tool is freely available as an easy-to-use web server and API at https://biosig.lab.uq.edu.au/csm_potential.

10.
Genes (Basel) ; 14(10)2023 09 29.
Article in English | MEDLINE | ID: mdl-37895239

ABSTRACT

Variants in non-homologous end joining (NHEJ) DNA repair genes are associated with various human syndromes, including microcephaly, growth delay, Fanconi anemia, and different hereditary cancers. However, very little has been done previously to systematically record the underlying molecular consequences of NHEJ variants and their link to phenotypic outcomes. In this study, a list of over 2983 missense variants of the principal components of the NHEJ system, including DNA Ligase IV, DNA-PKcs, Ku70/80 and XRCC4, reported in the clinical literature, was initially collected. The molecular consequences of variants were evaluated using in silico biophysical tools to quantitatively assess their impact on protein folding, dynamics, stability, and interactions. Cancer-causing and population variants within these NHEJ factors were statistically analyzed to identify molecular drivers. A comprehensive catalog of NHEJ variants from genes known to be mutated in cancer was curated, providing a resource for better understanding their role and molecular mechanisms in diseases. The variant analysis highlighted different molecular drivers among the distinct proteins, where cancer-driving variants in anchor proteins, such as Ku70/80, were more likely to affect key protein-protein interactions, whilst those in the enzymatic components, such as DNA-PKcs, were likely to be found in intolerant regions undergoing purifying selection. We believe that the information acquired in our database will be a powerful resource to better understand the role of non-homologous end-joining DNA repair in genetic disorders, and will serve as a source to inspire other investigations to understand the disease further, vital for the development of improved therapeutic strategies.


Subjects
DNA End-Joining Repair , Neoplasms , Humans , DNA End-Joining Repair/genetics , DNA Repair/genetics , DNA/genetics
11.
medRxiv ; 2023 Sep 10.
Article in English | MEDLINE | ID: mdl-37732177

ABSTRACT

CRISPR base editing screens are powerful tools for studying disease-associated variants at scale. However, the efficiency and precision of base editing perturbations vary, confounding the assessment of variant-induced phenotypic effects. Here, we provide an integrated pipeline that improves the estimation of variant impact in base editing screens. We perform high-throughput ABE8e-SpRY base editing screens with an integrated reporter construct to measure the editing efficiency and outcomes of each gRNA alongside their phenotypic consequences. We introduce BEAN, a Bayesian network that accounts for per-guide editing outcomes and target site chromatin accessibility to estimate variant impacts. We show this pipeline attains superior performance compared to existing tools in variant classification and effect size quantification. We use BEAN to pinpoint common variants that alter LDL uptake, implicating novel genes. Additionally, through saturation base editing of LDLR, we enable accurate quantitative prediction of the effects of missense variants on LDL-C levels, which aligns with measurements in UK Biobank individuals, and identify structural mechanisms underlying variant pathogenicity. This work provides a widely applicable approach to improve the power of base editor screens for disease-associated variant characterization.

12.
Genes (Basel) ; 14(9)2023 08 26.
Article in English | MEDLINE | ID: mdl-37761839

ABSTRACT

The development and approval of antivirals against SARS-CoV-2 has further equipped clinicians with treatment strategies against the COVID-19 pandemic, reducing deaths post-infection. Extensive clinical use of antivirals, however, can impart additional selective pressure, leading to the emergence of antiviral resistance. While we have previously characterized possible effects of circulating SARS-CoV-2 missense mutations on proteome function and stability, their direct effects on the novel antivirals remain unexplored. To address this, we have computationally calculated the consequences of mutations in the antiviral targets, RNA-dependent RNA polymerase and main protease, on target stability and interactions with their antiviral, nucleic acids, and other proteins. By analyzing circulating variants prior to antiviral approval, this work highlighted the inherent resistance potential of different genome regions. Namely, within the main protease binding site, missense mutations imparted a lower fitness cost, while the opposite was noted for the RNA-dependent RNA polymerase binding site. This suggests that resistance to nirmatrelvir/ritonavir combination treatment is more likely to occur and proliferate than that to molnupiravir. These insights are crucial both clinically in drug stewardship, and preclinically in the identification of less mutable targets for novel therapeutic design.


Subjects
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , COVID-19/genetics , Pandemics , Peptide Hydrolases
13.
Nucleic Acids Res ; 51(W1): W122-W128, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37283042

ABSTRACT

Understanding the effects of mutations on protein stability is crucial for variant interpretation and prioritisation, protein engineering, and biotechnology. Despite significant efforts, community assessments of predictive tools have highlighted ongoing limitations, including computational time, low predictive power, and biased predictions towards destabilising mutations. To fill this gap, we developed DDMut, a fast and accurate siamese network to predict changes in Gibbs Free Energy upon single and multiple point mutations, leveraging both forward and hypothetical reverse mutations to account for model anti-symmetry. Deep learning models were built by integrating graph-based representations of the localised 3D environment, with convolutional layers and transformer encoders. This combination better captured the distance patterns between atoms by extracting both short-range and long-range interactions. DDMut achieved Pearson's correlations of up to 0.70 (RMSE: 1.37 kcal/mol) on single point mutations, and 0.70 (RMSE: 1.84 kcal/mol) on double/triple mutants, outperforming most available methods across non-redundant blind test sets. Importantly, DDMut was highly scalable and demonstrated anti-symmetric performance on both destabilising and stabilising mutations. We believe DDMut will be a useful platform to better understand the functional consequences of mutations, and guide rational protein engineering. DDMut is freely available as a web server and API at https://biosig.lab.uq.edu.au/ddmut.
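
The abstract above refers to Pearson's correlation, RMSE and anti-symmetry between forward and hypothetical reverse mutations. The sketch below shows one common way such quantities can be computed for a stability predictor; it is not DDMut's code. It assumes the usual convention that an ideal predictor satisfies ΔΔG(reverse) = -ΔΔG(forward), and all numbers are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def antisymmetry_bias(forward, reverse):
    """Mean of (forward + reverse) / 2; zero for a perfectly anti-symmetric predictor."""
    return sum(f + r for f, r in zip(forward, reverse)) / (2 * len(forward))


# Hypothetical predicted ddG values (kcal/mol) for three mutations and their reverses.
forward = [1.0, -0.5, 2.0]
reverse = [-1.0, 0.5, -2.0]

print("forward/reverse correlation:", pearson(forward, reverse))
print("anti-symmetry bias:", antisymmetry_bias(forward, reverse))
```

For an anti-symmetric predictor, the forward/reverse correlation approaches -1 and the bias approaches 0; the same `pearson` and `rmse` helpers can be applied to predicted versus experimental values.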


Subjects
Deep Learning , Protein Stability , Proteins , Software , Mutation , Point Mutation , Proteins/chemistry , Proteins/genetics
14.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37382557

ABSTRACT

MOTIVATION: While antibodies have been ground-breaking therapeutic agents, the structural determinants for antibody binding specificity remain to be fully elucidated, which is compounded by the virtually unlimited repertoire of antigens they can recognize. Here, we have explored the structural landscapes of antibody-antigen interfaces to identify the structural determinants driving target recognition by assessing concavity and interatomic interactions. RESULTS: We found that complementarity-determining regions utilized deeper concavity with their longer H3 loops, with the H3 loops of nanobodies showing the deepest use of concavity. Of all amino acid residues found in complementarity-determining regions, tryptophan used deeper concavity, especially in nanobodies, making it suitable for leveraging concave antigen surfaces. Similarly, antigens utilized arginine to bind to deeper pockets of the antibody surface. Our findings fill a gap in knowledge about antibody specificity, binding affinity, and the nature of antibody-antigen interface features, which will lead to a better understanding of how antibodies can more effectively target druggable sites on antigen surfaces. AVAILABILITY AND IMPLEMENTATION: The data and scripts are available at: https://github.com/YoochanMyung/scripts.


Subjects
Antibodies , Complementarity Determining Regions , Complementarity Determining Regions/chemistry , Antibodies/chemistry , Antigens , Antibody Specificity , Antibody Binding Sites
15.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37382560

ABSTRACT

MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a comprehensive web-based resource that fills this gap by leveraging the well-established and robust graph-based signatures to supervised learning models using both protein sequence and structure information to accurately model protein function in terms of Subcellular Localization, Enzyme Commission (EC) numbers, and Gene Ontology (GO) terms. RESULTS: We show our models perform as well as or better than alternative approaches, achieving area under the receiver operating characteristic curve of up to 0.93 for subcellular localization, up to 0.93 for EC, and up to 0.81 for GO terms on independent blind tests. AVAILABILITY AND IMPLEMENTATION: LEGO-CSM's web server is freely available at https://biosig.lab.uq.edu.au/lego_csm. In addition, all datasets used to train and test LEGO-CSM's models can be downloaded at https://biosig.lab.uq.edu.au/lego_csm/data.


Subjects
Proteins , Software , Humans , Proteins/chemistry
16.
Int J Mol Sci ; 24(12)2023 Jun 15.
Article in English | MEDLINE | ID: mdl-37373306

ABSTRACT

Human aldehyde dehydrogenases (ALDHs), comprising 19 isoenzymes, play a vital role in both endogenous and exogenous aldehyde metabolism. This NAD(P)-dependent catalytic process relies on the intact structural and functional activity of the cofactor binding, substrate interaction, and the oligomerization of ALDHs. Disruptions to the activity of ALDHs, however, could result in the accumulation of cytotoxic aldehydes, which have been linked with a wide range of diseases, including cancers as well as neurological and developmental disorders. In our previous works, we have successfully characterised the structure-function relationships of the missense variants of other proteins. We, therefore, applied a similar analysis pipeline to identify potential molecular drivers of pathogenic ALDH missense mutations. Variant data were first carefully curated and labelled as cancer-risk, non-cancer diseases, and benign. We then leveraged various computational biophysical methods to describe the changes caused by missense mutations, informing a bias of detrimental mutations with destabilising effects. Building on these insights, several machine learning approaches were further utilised to investigate the combination of features, revealing the necessity of the conservation of ALDHs. Our work aims to provide important biological perspectives on pathogenic consequences of missense mutations of ALDHs, which could be invaluable resources in the development of cancer treatment.


Subjects
Aldehyde Dehydrogenase , Neoplasms , Humans , Aldehyde Dehydrogenase/metabolism , Missense Mutation , Neoplasms/genetics , Aldehydes
17.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37039696

ABSTRACT

The ability to identify B-cell epitopes is an essential step in vaccine design, immunodiagnostic tests and antibody production. Several computational approaches have been proposed to identify, from an antigen protein or peptide sequence, which residues are more likely to be part of an epitope, but have limited performance on relatively homogeneous data sets and lack interpretability, limiting biological insights that could otherwise be obtained. To address these limitations, we have developed epitope1D, an explainable machine learning method capable of accurately identifying linear B-cell epitopes, leveraging two new descriptors: a graph-based signature representation of protein sequences, based on our well-established Cutoff Scanning Matrix algorithm and Organism Ontology information. Our model achieved Areas Under the ROC curve of up to 0.935 on cross-validation and blind tests, demonstrating robust performance. A comprehensive comparison to alternative methods using distinct benchmark data sets was also employed, with our model outperforming state-of-the-art tools. epitope1D represents not only a significant advance in predictive performance, but also allows biologically meaningful features to be combined and used for model interpretation. epitope1D has been made available as a user-friendly web server interface and application programming interface at https://biosig.lab.uq.edu.au/epitope1d/.


Subjects
Algorithms , B-Lymphocyte Epitopes , Amino Acid Sequence , ROC Curve
18.
Int J Mol Sci ; 24(6)2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36982187

ABSTRACT

Developmental and epileptic encephalopathies (DEEs) are a group of epilepsies with early onset and severe symptoms that sometimes lead to death. Although previous work successfully discovered several genes implicated in disease outcomes, it remains challenging to identify causative mutations within these genes from the background variation present in all individuals due to disease heterogeneity. Nevertheless, our ability to detect possible pathogenic variants has continued to improve as in silico predictors of deleteriousness have advanced. We investigate their use in prioritising likely pathogenic variants in epileptic encephalopathy patients' whole exome sequences. We showed that the inclusion of structure-based predictors of intolerance improved upon previous attempts to demonstrate enrichment within epilepsy genes.


Subjects
Generalized Epilepsy , Epilepsy , Humans , Phenotype , Epilepsy/genetics , Mutation
19.
Pharmaceutics ; 15(2)2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36839752

ABSTRACT

Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity led to a significant reduction in clinical trial failures; however, we currently lack robust qualitative rules or predictive tools for peptide- and protein-based biologics. To address this, we have manually curated the largest set of high-quality experimental data on peptide and protein toxicities, and developed CSM-Toxin, a novel in silico protein toxicity classifier, which relies solely on the protein primary sequence. Our approach encodes the protein sequence information using a deep learning natural language model to understand "biological" language, where residues are treated as words and protein sequences as sentences. CSM-Toxin was able to accurately identify peptides and proteins with potential toxicity, achieving an MCC of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver.
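
The abstract above describes treating residues as words and protein sequences as sentences. As a rough sketch of that idea only, the following turns a sequence into integer token IDs of the kind a language model would consume. The vocabulary, special tokens and padding scheme are assumptions made for illustration and are not taken from CSM-Toxin.

```python
# The 20 standard amino acids, one letter each; every residue is one "word".
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical vocabulary: 0 is padding, 1 is any unknown residue symbol.
PAD_ID = 0
UNK_ID = 1
VOCAB = {aa: index + 2 for index, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence):
    """Split a protein sequence (the "sentence") into residue tokens (the "words")."""
    return list(sequence.strip().upper())


def encode(sequence, max_len):
    """Map residues to integer IDs, then truncate or right-pad to max_len."""
    ids = [VOCAB.get(token, UNK_ID) for token in tokenize(sequence)]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# "X" is not a standard residue, so it maps to the unknown ID.
print(encode("ACX", max_len=5))
```

Fixed-length ID vectors like these are a typical input to an embedding layer; the downstream model that would classify toxicity is outside the scope of this sketch.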

20.
J Chem Inf Model ; 63(2): 432-441, 2023 01 23.
Article in English | MEDLINE | ID: mdl-36595441

ABSTRACT

Teratogenic drugs can lead to extreme fetal malformation and consequently critically influence the fetus's health, yet the teratogenic risks associated with most approved drugs are unknown. Here, we propose a novel predictive tool, embryoTox, which utilizes a graph-based signature representation of the chemical structure of a small molecule to predict and classify molecules likely to be safe during pregnancy. embryoTox was trained and validated using in vitro bioactivity data of over 700 small molecules with characterized teratogenicity effects. Our final model achieved an area under the receiver operating characteristic curve (AUC) of up to 0.96 on 10-fold cross-validation and 0.82 on nonredundant blind tests, outperforming alternative approaches. We believe that our predictive tool will provide a practical resource for optimizing screening libraries to determine effective and safe molecules to use during pregnancy. To provide a simple and integrated platform to rapidly screen for potential safe molecules and their risk factors, we made embryoTox freely available online at https://biosig.lab.uq.edu.au/embryotox/.
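
Several abstracts in this listing, including the one above, report an area under the receiver operating characteristic curve (AUC). A minimal sketch of what that number measures: the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores below are invented toy values, not data from any of the studies.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for two positive and two negative molecules.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

This pairwise form is quadratic in the number of examples, so it suits small illustrations; rank-based implementations give the same value more efficiently on large test sets.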


Subjects
Research Design , Pregnancy , Female , Humans , ROC Curve